NSF PAR Search | NSF Public Access Repository

Supporting Complex Query Time Enrichment For Analytics

Ghosh, Dhrubajyoti; Gupta, Peeyush; Mehrotra, Sharad; Sharma, Shantanu (March 2023, 26th International Conference on Extending Database Technology (EDBT))

Full Text Available
Supporting Complex Query Time Enrichment For Analytics

https://doi.org/10.48786/EDBT.2023.08

Ghosh, Dhrubajyoti; Gupta, Peeyush; Mehrotra, Sharad; Sharma, Shantanu (January 2023, OpenProceedings.org)

JENNER: just-in-time enrichment in query processing

https://doi.org/10.14778/3551793.3551822

Ghosh, Dhrubajyoti; Gupta, Peeyush; Mehrotra, Sharad; Yus, Roberto; Altowim, Yasser (September 2022, Proceedings of the VLDB Endowment)

Emerging domains, such as sensor-driven smart spaces and social media analytics, require incoming data to be enriched prior to its use. Enrichment often consists of machine learning (ML) functions that are too expensive or infeasible to execute at ingestion. We develop a strategy entitled Just-in-time ENrichmeNt in quERy Processing (JENNER) to support interactive analytics over data as soon as it arrives in such application contexts. JENNER exploits the inherent tradeoffs between cost and quality often displayed by ML functions to progressively improve query answers during query execution. We describe how JENNER works for a large class of SPJ and aggregation queries that form the bulk of data analytics workloads. Our experimental results on real datasets (IoT and Tweet) show that JENNER achieves progressive answers and performs significantly better than naive strategies for progressive computation.
Full Text Available
A Case for Enrichment in Data Management Systems

https://doi.org/10.1145/3552490.3552497

Ghosh, Dhrubajyoti; Gupta, Peeyush; Mehrotra, Sharad; Sharma, Shantanu (July 2022, ACM SIGMOD Record)

We describe ENRICHDB, a new DBMS technology designed for emerging domains (e.g., sensor-driven smart spaces and social media analytics) that require incoming data to be enriched using expensive functions prior to its use. Today, to support online processing, such enrichment is performed outside the DBMS, as a static data processing workflow prior to ingestion. Such a strategy can result in a significant delay between the time data arrives and the time it is enriched and ingested into the DBMS, especially when the enrichment complexity is high. Also, enriching at ingestion can waste resources if applications do not use or require all data to be enriched. ENRICHDB's design represents a significant departure from the above: we explore seamless integration of data enrichment throughout the data processing pipeline, that is, at ingestion, in the background when triggered by events, and progressively during query processing. The cornerstone of ENRICHDB is a powerful enrichment data and query model that encapsulates enrichment as an operator inside a DBMS, enabling the DBMS to co-optimize enrichment with query processing. This paper describes this data model and provides a summary of the system implementation.
Full Text Available
MIDE: accuracy aware minimally invasive data exploration for decision support

https://doi.org/10.14778/3551793.3551821

Ghayyur, Sameera; Ghosh, Dhrubajyoti; He, Xi; Mehrotra, Sharad (July 2022, Proceedings of the VLDB Endowment)

This paper studies privacy in the context of decision-support queries that classify objects as either true or false based on whether they satisfy the query. Mechanisms to ensure privacy may result in false positives and false negatives. In decision-support applications, false negatives often have to remain bounded. Existing accuracy-aware privacy-preserving techniques cannot be used directly to support such an accuracy requirement, and their naive adaptations to support a bound on false negatives result in significant privacy loss, depending on the distribution of the data. This paper explores the concept of minimally invasive data exploration for decision support, which attempts to minimize privacy loss while supporting a bounded guarantee on false negatives by adaptively adjusting privacy based on the data distribution. Our experimental results show that the MIDE algorithms perform well and are robust to variations in data distributions.
Full Text Available
PRISM: Private Verifiable Set Computation over Multi-Owner Outsourced Databases

https://doi.org/10.1145/3448016.3452839

Li, Yin; Ghosh, Dhrubajyoti; Gupta, Peeyush; Mehrotra, Sharad; Panwar, Nisha; Sharma, Shantanu (June 2021, ACM)

Full Text Available
A privacy-enabled platform for COVID-19 applications: poster abstract

https://doi.org/10.1145/3384419.3430594

August, Michael; Davison, Christopher; Diallo, Mamadou; Ghosh, Dhrubajyoti; Gupta, Peeyush; Graves, Christopher; Han, Shanshan; Holstrom, Michael; Khargonekar, Pramod; Kline, Megan; et al (November 2020, SenSys '20: Proceedings of the 18th Conference on Embedded Networked Sensor Systems)

Full Text Available
